Improvements to profile PSTMM for glycan recognition profile prediction
Abstract
Glycans are biomolecules composed of various monosaccharides, and they are bound to proteins and lipids on the cell surface. Glycans are known to be key players in biological phenomena such as blood type determination, cell adhesion, antigen-antibody reactions, and viral infection. It is known that more than half of the proteins in major protein databases such as SWISS-PROT are glycosylated. Unlike proteins, however, glycans are tree structures that vary across organs, tissues, and even cells. Lectins are glycan-binding proteins that recognize monosaccharides at the non-reducing end, and it is believed that they may also recognize sugars further along the glycan chain. To capture these potential recognition structures, PSTMM (Probabilistic Sibling-dependent Tree Markov Model), which considers relationships between “sibling” monosaccharides within glycans, was developed. Furthermore, a profile version of PSTMM, called profile PSTMM, was developed [1] and was implemented as a web tool last year. Profile PSTMM requires the definition of a “state model”, which defines the structure of the profile to be learned from the training data. The state model of the profile PSTMM web tool is currently generated from the maximum common subtree (MCST) of all input glycan structures. Because a subtree must occur in every input glycan to be part of the MCST, this results in a very small state model. To compensate for this, we are developing a method that multiply aligns all glycan structures and extracts glycan substructure blocks, using an approach similar to ClustalW. We introduce this new method of multiple tree alignment here.
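The ClustalW-like idea mentioned above can be illustrated with a minimal sketch. This is not the authors' implementation: it only shows the first two stages of a progressive scheme (pairwise scoring, then choosing an order in which to merge), and it assumes glycans are rooted, ordered trees whose nodes carry only a residue name (linkage information is ignored). The names `node`, `rooted_common`, `similarity`, and `progressive_order` are hypothetical, and the pairwise score used here (size of the largest common top-down subtree) is a stand-in for whatever scoring the actual method uses.

```python
from itertools import combinations

def node(label, *children):
    """A glycan residue: (label, [ordered child residues])."""
    return (label, list(children))

def rooted_common(a, b):
    """Size of the largest common ordered subtree rooted at both a and b."""
    if a[0] != b[0]:
        return 0
    ca, cb = a[1], b[1]
    # LCS-style dynamic programme over the two ordered child lists.
    dp = [[0] * (len(cb) + 1) for _ in range(len(ca) + 1)]
    for i in range(1, len(ca) + 1):
        for j in range(1, len(cb) + 1):
            match = dp[i - 1][j - 1] + rooted_common(ca[i - 1], cb[j - 1])
            dp[i][j] = max(dp[i - 1][j], dp[i][j - 1], match)
    return 1 + dp[-1][-1]

def all_nodes(tree):
    """Yield every node of the tree, root first."""
    yield tree
    for child in tree[1]:
        yield from all_nodes(child)

def similarity(a, b):
    """Best common-subtree size over every pair of nodes in a and b."""
    return max(rooted_common(x, y) for x in all_nodes(a) for y in all_nodes(b))

def progressive_order(trees):
    """Greedy merge order: most similar pair first, then the tree
    closest to anything already chosen (needs at least two trees)."""
    n = len(trees)
    sim = {(i, j): similarity(trees[i], trees[j])
           for i, j in combinations(range(n), 2)}
    score = lambda i, j: sim[(min(i, j), max(i, j))]
    order = list(max(sim, key=sim.get))
    remaining = set(range(n)) - set(order)
    while remaining:
        nxt = max(sorted(remaining),
                  key=lambda k: max(score(k, i) for i in order))
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Toy data (illustrative only): two N-glycan-like trees and one O-glycan-like tree.
g1 = node("GlcNAc", node("GlcNAc", node("Man", node("Man"), node("Man"))))
g2 = node("GlcNAc", node("GlcNAc",
          node("Man", node("Man", node("GlcNAc")), node("Man"))))
g3 = node("GalNAc", node("Gal"), node("GlcNAc"))

if __name__ == "__main__":
    print(similarity(g1, g2), similarity(g1, g3))
    print(progressive_order([g3, g1, g2]))
```

On this toy data the two N-glycan-like trees share a five-residue subtree, while either of them shares only a single residue with the O-glycan-like tree. That is the situation in which an MCST taken over all inputs collapses to almost nothing, whereas a progressive scheme can first align the similar pair and only then bring in the outlier.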